robust risk
Achievable distributional robustness when the robust risk is only partially identified
In safety-critical applications, machine learning models should generalize well under worst-case distribution shifts, that is, have a small robust risk. Invariance-based algorithms can provably take advantage of structural assumptions on the shifts when the training distributions are heterogeneous enough to identify the robust risk. In practice, however, such identifiability conditions are rarely satisfied, a scenario so far underexplored in the theoretical literature. In this paper, we aim to fill this gap and propose to study the more general setting of partially identifiable robustness. In particular, we define a new risk measure, the identifiable robust risk, and its corresponding (population) minimax quantity, which is an algorithm-independent measure of the best achievable robustness under partial identifiability. We introduce these concepts broadly and then, for concreteness, study them within the framework of linear structural causal models. We use the minimax quantity to show that previous approaches provably achieve suboptimal robustness in the partially identifiable case. We confirm our findings through empirical simulations and real-world experiments, and demonstrate that the test error of existing robustness methods becomes increasingly suboptimal as the proportion of previously unseen test directions increases.
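The quantities named in the abstract can be written schematically as follows. This is only a sketch in assumed notation (the symbols $\mathcal{P}$, $\Theta$, $\mathcal{P}_\theta$, $\ell$, $f$ and $\mathfrak{M}$ are introduced here for illustration and are not taken from the paper); the paper's exact definitions may differ.

```latex
% Assumed notation, for illustration only:
%   \mathcal{P}        set of test distributions reachable under the assumed shifts
%   \ell, f            loss function and predictor
%   \Theta             model parameters compatible with the training distributions
%   \mathcal{P}_\theta shift set induced by a compatible parameter \theta
\[
  R_{\mathrm{rob}}(f) \;=\; \sup_{P \in \mathcal{P}} \,
  \mathbb{E}_{(X,Y) \sim P}\bigl[\ell(f(X), Y)\bigr]
\]
% When the training data do not pin down a single \theta, one natural
% worst-case reading takes a further supremum over all compatible parameters,
% and the minimax quantity is the best value any predictor can reach.
\[
  R_{\mathrm{rob}}^{\mathrm{id}}(f) \;=\; \sup_{\theta \in \Theta} \,
  \sup_{P \in \mathcal{P}_\theta} \mathbb{E}_{(X,Y) \sim P}\bigl[\ell(f(X), Y)\bigr],
  \qquad
  \mathfrak{M} \;=\; \inf_{f} \, R_{\mathrm{rob}}^{\mathrm{id}}(f)
\]
```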
A Closed form expressions for the robust risks
In Sections A.1 and A.2 we derive closed-form expressions of the standard and robust risks from ...
We first prove Equation (13).
We now prove the second part of the statement.
In this section we provide additional details on our experiments.
B.1 Neural networks on sanitized binary MNIST
If not mentioned otherwise, we use noiseless i.i.d. ...
... C.1 we give an intuitive explanation for the robust overfitting phenomenon described in ...
... C.2 we discuss how inconsistent adversarial training prevents ...
We now shed light on the phenomena revealed by Theorem 3.1 and Figure 2. In particular, we ...
In this section we further discuss robust logistic regression studied in Section 4.
As observed in Section 4.4, label noise can prevent interpolation and hence improve the robust risk ...
Hence, inconsistent training perturbations can induce spurious regularization effects.
Understanding Robust Machine Learning for Nonparametric Regression with Heavy-Tailed Noise
We investigate robust nonparametric regression in the presence of heavy-tailed noise, where the hypothesis class may contain unbounded functions and robustness is ensured via a robust loss function $\ell_σ$. Using Huber regression as a close-up example within Tikhonov-regularized risk minimization in reproducing kernel Hilbert spaces (RKHS), we address two central challenges: (i) the breakdown of standard concentration tools under weak moment assumptions, and (ii) the analytical difficulties introduced by unbounded hypothesis spaces. Our first message is conceptual: conventional generalization-error bounds for robust losses do not faithfully capture out-of-sample performance. We argue that learnability should instead be quantified through prediction error, namely the $L_2$-distance to the truth $f^\star$, which is $σ$-independent and directly reflects the target of robust estimation. To make this workable under unboundedness, we introduce a probabilistic effective hypothesis space that confines the estimator with high probability and enables a meaningful bias-variance decomposition under weak $(1+ε)$-moment conditions. Technically, we establish new comparison theorems linking the excess robust risk to the $L_2$ prediction error up to a residual of order $\mathcal{O}(σ^{-2ε})$, clarifying the robustness-bias trade-off induced by the scale parameter $σ$. Building on this, we derive explicit finite-sample error bounds and convergence rates for Huber regression in RKHS that hold without uniform boundedness and under heavy-tailed noise. Our study delivers principled tuning rules, extends beyond Huber to other robust losses, and highlights prediction error, not excess generalization risk, as the fundamental lens for analyzing robust learning.
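The estimator discussed in this abstract, Huber regression with Tikhonov regularization in an RKHS, can be sketched in a few lines. The following is a minimal illustration, not the paper's procedure: it assumes a Gaussian kernel, the common Huber scaling $\ell_σ(r) = r^2/2$ for $|r| \le σ$ and $σ|r| - σ^2/2$ otherwise, and an iteratively reweighted least-squares (IRLS) solver. The names `gaussian_kernel`, `huber_loss`, `fit_kernel_huber` and `predict`, and the parameters `lam` and `bandwidth`, are hypothetical.

```python
import numpy as np


def gaussian_kernel(A, B, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def huber_loss(r, sigma):
    """Huber loss: quadratic for |r| <= sigma, linear beyond (assumed scaling)."""
    a = np.abs(r)
    return np.where(a <= sigma, 0.5 * r**2, sigma * a - 0.5 * sigma**2)


def fit_kernel_huber(X, y, sigma=1.0, lam=1e-3, bandwidth=1.0, n_iter=50):
    """Approximately minimize (1/n) * sum_i huber(y_i - f(x_i)) + lam * ||f||_K^2
    over f = sum_j alpha_j K(x_j, .), using IRLS.

    Each iteration solves the weighted kernel ridge system
        (W K + 2 n lam I) alpha = W y,
    where W holds the Huber weights 1 (small residual) or sigma/|r| (large residual).
    """
    n = X.shape[0]
    K = gaussian_kernel(X, X, bandwidth)
    alpha = np.zeros(n)
    for _ in range(n_iter):
        r = y - K @ alpha
        abs_r = np.maximum(np.abs(r), 1e-12)  # guard against division by zero
        w = np.where(abs_r <= sigma, 1.0, sigma / abs_r)
        alpha = np.linalg.solve(w[:, None] * K + 2.0 * n * lam * np.eye(n), w * y)
    return alpha, K


def predict(alpha, X_train, X_new, bandwidth=1.0):
    """Evaluate the fitted function at new inputs."""
    return gaussian_kernel(X_new, X_train, bandwidth) @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(100, 1))
    f_star = np.sin(3.0 * X[:, 0])
    # Student-t noise with 1.5 degrees of freedom has infinite variance,
    # i.e. only a (1 + eps)-th moment for eps < 0.5.
    y = f_star + 0.2 * rng.standard_t(df=1.5, size=100)
    alpha, _ = fit_kernel_huber(X, y, sigma=0.5, lam=1e-3, bandwidth=0.3)
    # Empirical L2 prediction error against the (here known) truth f_star.
    print(np.mean((predict(alpha, X, X, bandwidth=0.3) - f_star) ** 2))
```

In this sketch a very large `sigma` makes every weight equal to 1, so the fit reduces to ordinary kernel ridge regression, while a small `sigma` downweights large residuals at the cost of additional bias. This mirrors the robustness-bias trade-off in the scale parameter described above.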